'1 + 1 > 2': Merging Distance and Density Based Clustering
Abstract
Clustering is an important data exploration task, and its use in data mining is growing rapidly. Traditional clustering algorithms that no longer meet data mining requirements are increasingly being modified. The many clustering algorithms can be divided into several categories, two prominent ones being distance-based and density-based (e.g., K-means and DBSCAN, respectively). K-means is fast, easy to implement, and converges to local optima almost surely, but it is easily affected by noise. Density-based clustering, on the other hand, can find arbitrarily shaped clusters and handles noise well, but it is slow in comparison because of the neighborhood search for each data point, and setting its density threshold properly is difficult. In this paper, we propose BRIDGE, which efficiently merges the two by exploiting the advantages of one to counter the limitations of the other, and vice versa. BRIDGE enables DBSCAN to handle very large data efficiently and improves the quality of K-means clusters by removing noisy points. It also helps the user set the density threshold parameter properly. We further show that other clustering algorithms can be merged using a similar strategy; an example given in the paper merges BIRCH clustering with DBSCAN.
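The abstract does not spell out how BRIDGE combines the two methods, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm: K-means first partitions the data, DBSCAN then runs inside each partition (so each neighborhood search covers fewer points), and points DBSCAN marks as noise are dropped from the K-means clusters. The function names (`kmeans`, `dbscan`, `hybrid`), the seeding rule, and the sample data are all hypothetical, and clusters that straddle a partition boundary are deliberately not re-merged here.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; evenly spaced points seed the centers (a simplification)."""
    centers = [points[i * len(points) // k] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign

def dbscan(points, eps, min_pts):
    """Textbook DBSCAN with brute-force neighborhood search; -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            jn = neighbors(j)
            if len(jn) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(jn)
    return labels

def hybrid(points, k, eps, min_pts):
    """Partition with k-means, then run DBSCAN inside each partition (assumed scheme)."""
    assign = kmeans(points, k)
    result = [-1] * len(points)
    next_id = 0
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        local = dbscan([points[i] for i in idx], eps, min_pts)
        for i, lab in zip(idx, local):
            if lab != -1:
                result[i] = next_id + lab  # give each partition its own label range
        next_id += max(local, default=-1) + 1
    return result

# Hypothetical data: two tight blobs plus one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 30)]
labels = hybrid(points, k=2, eps=1.5, min_pts=3)
print(labels)
```

In this sketch the outlier is assigned to one of the K-means partitions but is then labeled -1 by the per-partition DBSCAN pass, which illustrates the noise-removal benefit the abstract describes. A fuller treatment would also have to handle dense regions that cross partition boundaries.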
Similar resources
Merging Similarity and Trust Based Social Networks to Enhance the Accuracy of Trust-Aware Recommender Systems
In recent years, collaborative filtering (CF) methods have become important and widely accepted techniques for recommender systems. One of these techniques is user-based: it produces useful recommendations based on the similarity of ratings given by like-minded users. However, these systems suffer from several inherent shortcomings such as data sparsity and cold start problems. With the dev...
Local Density-based Hierarchical Clustering for Overlapping Distribution using Minimum Spanning Tree
In this paper, we propose a clustering algorithm to find clusters of different sizes, shapes and densities. Density and Hierarchical based approaches are adopted in the algorithm using Minimum Spanning Tree, resulting in a new algorithm – Local Density-based Hierarchical Clustering Algorithm for overlapping data distribution using Minimum Spanning Tree (LDHCODMST). The algorithm is divided into...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the rapid growth of spatial trajectory data, together with the need to process it and extract useful information and meaningful patterns, has drawn many researchers to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...
A quick and coarse color image segmentation
In this paper we focus on the problem of image segmentation by color classification. We present a robust agglomerative clustering algorithm based on a cluster validity criterion derived from fuzzy partitions. The result is a simplified segmentation having a small number of large regions. The interest of the proposed method is that it requires a single parameter and that the computational comple...